 product name


Multi-dimensional Data Analysis and Applications Basing on LLM Agents and Knowledge Graph Interactions

Wang, Xi, Ling, Xianyao, Li, Kun, Yin, Gang, Zhang, Liang, Wu, Jiang, Xu, Jun, Zhang, Fu, Lei, Wenbo, Wang, Annie, Gong, Peng

arXiv.org Artificial Intelligence

In the current era of big data, extracting deep insights from massive, heterogeneous, and complexly associated multi-dimensional data has become a significant challenge. Large Language Models (LLMs) perform well in natural language understanding and generation, but still suffer from "hallucination" issues when processing structured knowledge and are difficult to update in real time. Although Knowledge Graphs (KGs) can explicitly store structured knowledge, their static nature limits dynamic interaction and analytical capabilities. Therefore, this paper proposes a multi-dimensional data analysis method based on the interactions between LLM agents and KGs, constructing a dynamic, collaborative analytical ecosystem. This method utilizes LLM agents to automatically extract product data from unstructured data, constructs and visualizes the KG in real time, and supports users in deep exploration and analysis of graph nodes through an interactive platform. Experimental results show that this method has significant advantages in product ecosystem analysis, relationship mining, and user-driven exploratory analysis, providing new ideas and tools for multi-dimensional data analysis.


Question-to-Knowledge (Q2K): Multi-Agent Generation of Inspectable Facts for Product Mapping

Seo, Wonduk, Shin, Taesub, An, Hyunjin, Kim, Dokyun, Lee, Seunghyun

arXiv.org Artificial Intelligence

Identifying whether two product listings refer to the same Stock Keeping Unit (SKU) is a persistent challenge in e-commerce, especially when explicit identifiers are missing and product names vary widely across platforms. Rule-based heuristics and keyword similarity often misclassify products by overlooking subtle distinctions in brand, specification, or bundle configuration. To overcome these limitations, we propose Question-to-Knowledge (Q2K), a multi-agent framework that leverages Large Language Models (LLMs) for reliable SKU mapping. Q2K integrates: (1) a Reasoning Agent that generates targeted disambiguation questions, (2) a Knowledge Agent that resolves them via focused web searches, and (3) a Deduplication Agent that reuses validated reasoning traces to reduce redundancy and ensure consistency. A human-in-the-loop mechanism further refines uncertain cases. Experiments on real-world consumer goods datasets show that Q2K surpasses strong baselines, achieving higher accuracy and robustness in difficult scenarios such as bundle identification and brand origin disambiguation. By reusing retrieved reasoning instead of issuing repeated searches, Q2K balances accuracy with efficiency, offering a scalable and interpretable solution for product integration.
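
The idea of reusing validated reasoning instead of repeating searches can be sketched roughly as a cache in front of the search step. This is only an illustration, not the Q2K implementation: the class `TraceStore`, the function `fake_search`, and the exact-match lookup on a normalized question are all assumptions made for this example (the paper's Deduplication Agent may well match traces differently).

```python
class TraceStore:
    """Hypothetical store that reuses answers to already-resolved questions."""

    def __init__(self, search):
        self.search = search      # callable standing in for a web search step
        self.traces = {}          # normalized question -> stored answer
        self.search_calls = 0     # how many real searches were issued

    def resolve(self, question):
        # Assumption: questions match if equal after lowercasing and
        # collapsing whitespace. A real system would likely be fuzzier.
        key = " ".join(question.lower().split())
        if key not in self.traces:
            self.search_calls += 1
            self.traces[key] = self.search(question)
        return self.traces[key]


def fake_search(question):
    # Placeholder for a Knowledge Agent lookup; returns a canned string.
    return "answer to: " + question.strip().lower()


store = TraceStore(fake_search)
first = store.resolve("Is the 6-pack a bundle of the single item?")
second = store.resolve("is the 6-pack a bundle   of the single item?")
print(store.search_calls)
```

In this sketch the second call is served from the stored trace, so only one search is issued for the two differently formatted questions.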


To See or To Read: User Behavior Reasoning in Multimodal LLMs

Dong, Tianning, Ma, Luyi, Vasudevan, Varun, Cho, Jason, Kumar, Sushant, Achan, Kannan

arXiv.org Artificial Intelligence

Multimodal Large Language Models (MLLMs) are reshaping how modern agentic systems reason over sequential user-behavior data. However, whether textual or image representations of user behavior data are more effective for maximizing MLLM performance remains underexplored. We present BehaviorLens, a systematic benchmarking framework for assessing modality trade-offs in user-behavior reasoning across six MLLMs by representing transaction data as (1) a text paragraph, (2) a scatter plot, and (3) a flowchart. Using a real-world purchase-sequence dataset, we find that when data is represented as images, MLLMs' next-purchase prediction accuracy improves by 87.5% compared with an equivalent textual representation, without any additional computational cost.


Contextually Aware E-Commerce Product Question Answering using RAG

Tangarajan, Praveen, Rajasekar, Anand A., Rathi, Manish, Dandin, Vinay Rao, Ersoy, Ozan

arXiv.org Artificial Intelligence

E-commerce product pages contain a mix of structured specifications, unstructured reviews, and contextual elements like personalized offers or regional variants. Although informative, this volume can lead to cognitive overload, making it difficult for users to quickly and accurately find the information they need. Existing Product Question Answering (PQA) systems often fail to utilize rich user context and diverse product information effectively. We propose a scalable, end-to-end framework for e-commerce PQA using Retrieval Augmented Generation (RAG) that deeply integrates contextual understanding. Our system leverages conversational history, user profiles, and product attributes to deliver relevant and personalized answers. It adeptly handles objective, subjective, and multi-intent queries across heterogeneous sources, while also identifying information gaps in the catalog to support ongoing content improvement. We also introduce novel metrics to measure the framework's performance which are broadly applicable for RAG system evaluations.


Flippi: End To End GenAI Assistant for E-Commerce

Rajasekar, Anand A., Tangarajan, Praveen, Nainani, Anjali, Batwal, Amogh, Dandin, Vinay Rao, Trivedi, Anusua, Ersoy, Ozan

arXiv.org Artificial Intelligence

The emergence of conversational assistants has fundamentally reshaped user interactions with digital platforms. This paper introduces Flippi, a cutting-edge, end-to-end conversational assistant powered by large language models (LLMs) and tailored for the e-commerce sector. Flippi addresses the challenges posed by the vast and often overwhelming product landscape, enabling customers to discover products more efficiently through natural language dialogue. By accommodating both objective and subjective user requirements, Flippi delivers a personalized shopping experience that surpasses traditional search methods. This paper details how Flippi interprets customer queries to provide precise product information, leveraging advanced NLP techniques such as Query Reformulation, Intent Detection, Retrieval-Augmented Generation (RAG), Named Entity Recognition (NER), and Context Reduction. Flippi's unique capability to identify and present the most attractive offers on an e-commerce site is also explored, demonstrating how it empowers users to make cost-effective decisions. Additionally, the paper discusses Flippi's comparative analysis features, which help users make informed choices by contrasting product features, prices, and other relevant attributes. The system's robust architecture is outlined, emphasizing its adaptability for integration across various e-commerce platforms and the technological choices underpinning its performance and accuracy. Finally, a comprehensive evaluation framework is presented, covering performance metrics, user satisfaction, and the impact on customer engagement and conversion rates. By bridging the convenience of online shopping with the personalized assistance traditionally found in physical stores, Flippi sets a new standard for customer satisfaction and engagement in the digital marketplace.


ConECT Dataset: Overcoming Data Scarcity in Context-Aware E-Commerce MT

Pokrywka, Mikołaj, Kusa, Wojciech, Rutkowski, Mieszko, Koszowski, Mikołaj

arXiv.org Artificial Intelligence

Neural Machine Translation (NMT) has improved translation by using Transformer-based models, but it still struggles with word ambiguity and context. This problem is especially important in domain-specific applications, which often have problems with unclear sentences or poor data quality. Our research explores how adding information to models can improve translations in the context of e-commerce data. To this end, we create ConECT, a new Czech-to-Polish e-commerce product translation dataset of 11,400 sentence pairs, coupled with images and product metadata. We then investigate and compare different methods that are applicable to context-aware translation. We test a vision-language model (VLM), finding that visual context aids translation quality. Additionally, we explore the incorporation of contextual information into text-to-text models, such as the product's category path or image descriptions. The results of our study demonstrate that the incorporation of contextual information leads to an improvement in the quality of machine translation. We make the new dataset publicly available.


AWML: An Open-Source ML-based Robotics Perception Framework to Deploy for ROS-based Autonomous Driving Software

Tanaka, Satoshi, Thapa, Samrat, Tan, Kok Seang, Szymko, Amadeusz, Kenzo, Lobos, Minoda, Koji, Tomie, Shintaro, Uetake, Kotaro, Zhang, Guolong, Yamashita, Isamu, Horibe, Takamasa

arXiv.org Artificial Intelligence

In recent years, machine learning technologies have played an important role in robotics, particularly in the development of autonomous robots and self-driving vehicles. As the industry matures, robotics frameworks like ROS 2 have been developed and provide a broad range of applications from research to production. In this work, we introduce AWML, a framework designed to support MLOps for robotics. AWML provides a machine learning infrastructure for autonomous driving, supporting not only the deployment of trained models to robotic systems, but also an active learning pipeline that incorporates auto-labeling, semi-auto-labeling, and data mining techniques.


Text classification using machine learning methods

Oancea, Bogdan

arXiv.org Artificial Intelligence

In this paper we present the results of an experiment aimed at using machine learning methods to obtain models that can be used for the automatic classification of products. In order to apply automatic classification methods, we transformed the product names from a text representation to numeric vectors, a process called word embedding. We used several embedding methods: Count Vectorization, TF-IDF, Word2Vec, FASTTEXT, and GloVe. Having the product names in the form of numeric vectors, we proceeded with a set of machine learning methods for automatic classification: Logistic Regression, Multinomial Naive Bayes, kNN, Artificial Neural Networks, Support Vector Machines, and Decision trees with several variants. The results show an impressive accuracy of the classification process for Support Vector Machines, Logistic Regression, and Random Forests. Regarding the word embedding methods, the best results were obtained with the FASTTEXT technique.
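
As a rough illustration of one combination named above (count vectors fed to Multinomial Naive Bayes), the following minimal sketch classifies product names using only the Python standard library. It is not the paper's code: the class `TinyMultinomialNB`, the whitespace tokenizer, the Laplace smoothing value, and the six training names are all assumptions invented for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(name):
    # Assumption: lowercase whitespace tokens are a good enough "count vector".
    return name.lower().split()


class TinyMultinomialNB:
    """Multinomial Naive Bayes over token counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, names, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for name, label in zip(names, labels):
            self.token_counts[label].update(tokenize(name))
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, name):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokenize(name):
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, invented for this sketch.
train_names = ["whole milk 1l", "skimmed milk 2l", "cheddar cheese block",
               "mens cotton t-shirt", "womens denim jeans", "cotton socks pack"]
train_labels = ["dairy", "dairy", "dairy", "clothing", "clothing", "clothing"]

model = TinyMultinomialNB().fit(train_names, train_labels)
print(model.predict("organic milk"))
print(model.predict("cotton jeans"))
```

A real experiment would use a proper train/test split and far more data; the sketch only shows how product names become token counts and how those counts drive the class decision.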


Zero-Shot ATC Coding with Large Language Models for Clinical Assessments

Chen, Zijian, Gamble, John-Michael, Jantzi, Micaela, Hirdes, John P., Lin, Jimmy

arXiv.org Artificial Intelligence

Manual assignment of Anatomical Therapeutic Chemical (ATC) codes to prescription records is a significant bottleneck in healthcare research and operations at Ontario Health and InterRAI Canada, requiring extensive expert time and effort. To automate this process while maintaining data privacy, we develop a practical approach using locally deployable large language models (LLMs). Inspired by recent advances in automatic International Classification of Diseases (ICD) coding, our method frames ATC coding as a hierarchical information extraction task, guiding LLMs through the ATC ontology level by level. We evaluate our approach using GPT-4o as an accuracy ceiling and focus development on open-source Llama models suitable for privacy-sensitive deployment. Testing across Health Canada drug product data, the RABBITS benchmark, and real clinical notes from Ontario Health, our method achieves 78% exact match accuracy with GPT-4o and 60% with Llama 3.1 70B. We investigate knowledge grounding through drug definitions, finding modest improvements in accuracy. Further, we show that fine-tuned Llama 3.1 8B matches zero-shot Llama 3.1 70B accuracy, suggesting that effective ATC coding is feasible with smaller models. Our results demonstrate the feasibility of automatic ATC coding in privacy-sensitive healthcare environments, providing a foundation for future deployments.
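
The level-by-level framing can be sketched as a walk down the ATC tree, where each step picks one child of the current code. This is a hedged illustration rather than the authors' pipeline: `ATC_CHILDREN` is a tiny hand-written excerpt that should be checked against the official ATC index, and `keyword_choose` is a trivial keyword-overlap stand-in for the LLM call, which the abstract does not specify.

```python
import re

# Tiny hand-written excerpt of the hierarchy, keyed by parent code.
# Assumption: labels are abbreviated; verify against the official ATC index.
ATC_CHILDREN = {
    "": {"A": "alimentary tract and metabolism",
         "C": "cardiovascular system"},
    "A": {"A02": "drugs for acid related disorders",
          "A10": "drugs used in diabetes"},
    "A10": {"A10A": "insulins and analogues",
            "A10B": "blood glucose lowering drugs, excl. insulins"},
    "A10B": {"A10BA": "biguanides",
             "A10BB": "sulfonylureas"},
    "A10BA": {"A10BA02": "metformin"},
}


def words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_choose(description, options):
    # Stand-in for the model: pick the option whose label shares the most
    # words with the drug description.
    desc = words(description)
    return max(options, key=lambda code: len(desc & words(options[code])))


def code_drug(description, children, choose):
    """Descend one ATC level at a time until a leaf code is reached."""
    code = ""
    while code in children:
        options = children[code]
        picked = choose(description, options)
        if picked not in options:
            raise ValueError("choice must be one of the offered codes")
        code = picked
    return code


record = ("metformin 500 mg tablet; class: biguanides; "
          "blood glucose lowering; used in diabetes; metabolism")
print(code_drug(record, ATC_CHILDREN, keyword_choose))
```

Restricting each step to the children of the current node keeps every intermediate choice inside the ontology, which is the main appeal of the hierarchical framing; swapping `keyword_choose` for a model call would leave `code_drug` unchanged.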


Benchmarking pre-trained text embedding models in aligning built asset information

Shahinmoghadam, Mehrzad, Motamedi, Ali

arXiv.org Artificial Intelligence

Accurate mapping of the built asset information to established data classification systems and taxonomies is crucial for effective asset management, whether for compliance at project handover or ad-hoc data integration scenarios. Due to the complex nature of built asset data, which predominantly comprises technical text elements, this process remains largely manual and reliant on domain expert input. Recent breakthroughs in contextual text representation learning (text embedding), particularly through pre-trained large language models, offer promising approaches that can facilitate the automation of cross-mapping of the built asset data. However, no comprehensive evaluation has yet been conducted to assess these models' ability to effectively represent the complex semantics specific to built asset technical terminology. This study presents a comparative benchmark of state-of-the-art text embedding models to evaluate their effectiveness in aligning built asset information with domain-specific technical concepts. Our proposed datasets are derived from two renowned built asset data classification dictionaries. The results of our benchmarking across six proposed datasets, covering three tasks of clustering, retrieval, and reranking, highlight the need for future research on domain adaptation techniques. The benchmarking resources are published as an open-source library, which will be maintained and extended to support future evaluations in this field.